Abstract: The need to estimate a particular quantile of a 
distribution is an important problem which frequently arises 
in many computer vision and signal processing applications. 
For example, our work was motivated by the requirements of 
many semi-automatic surveillance analytics systems which detect 
abnormalities in closed-circuit television (CCTV) footage using 
statistical models of low-level motion features. In this paper 
we specifically address the problem of estimating the running 
quantile of a data stream with non-stationary stochasticity when 
the memory for storing observations is limited. We make several 
major contributions: (i) we derive an important theoretical result 
which shows that the change in the quantile of a stream is 
constrained regardless of the stochastic properties of data, (ii) 
we describe a set of high-level design goals for an effective 
estimation algorithm that emerge as a consequence of our 
theoretical findings, (iii) we introduce a novel algorithm which 
implements the aforementioned design goals by retaining a 
sample of data values in a manner adaptive to changes in 
the distribution of data and progressively narrowing down its 
focus in the periods of quasi-stationary stochasticity, and (iv) we 
present a comprehensive evaluation of the proposed algorithm 
and compare it with the existing methods in the literature on 
both synthetic data sets and three large ‘real-world’ streams 
acquired in the course of operation of an existing commercial 
surveillance system. Our findings convincingly demonstrate that 
the proposed method is highly successful and vastly outperforms 
the existing alternatives, especially when the target quantile is 
high valued and the available buffer capacity severely limited. 

I. Introduction 

Quantile estimation is of pervasive importance across a 
variety of signal processing applications. It is used extensively 
in data mining, simulation modelling d, database 
maintenance, risk management in finance m, m, and the 
analysis of computer network latencies Q, 0, amongst 
others. A particularly challenging form of the quantile estimation 
problem arises when the desired quantile is high-valued 
(close to unity) and when data needs to be processed as a 
stream, with limited memory capacity. An illustrative practical 
example of when this is the case is encountered in CCTV- 
based surveillance systems 0 . In summary, as various types of 
low-level observations related to events in the scene of interest 
arrive in real-time, quantiles of the corresponding statistics 
for time windows of different durations are needed in order 
to distinguish ‘normal’ (common) events from those which 
are in some sense unusual and thus require human attention. 
The amount of incoming data is extraordinarily large and the 
capabilities of the available hardware highly limited both in 
terms of storage capacity and processing power. 

A. Previous work 

Unsurprisingly, the problem of estimating a quantile of a set 
has received considerable attention, much of it in the realm 
of theoretical research. In particular, a substantial amount of 
work has focused on the study of asymptotic computational 
complexity of quantile estimation algorithms ifTB . Il2()l . An 
important result emerging from this corpus of work is the 
proof by Munro and Paterson EOll that the working memory 
requirement of any algorithm that determines the median of a 
set by making at most p sequential passes through the input is 
$\Omega(n^{1/p})$ (i.e. asymptotically growing at least as fast as 
$n^{1/p}$). This implies that the exact computation of a quantile 
in a single pass requires $\Omega(n)$ working memory. Therefore a single-pass algorithm, 
required to process streaming data, will necessarily produce 
an estimate and not be able to guarantee the exactness of its 
result. 

Most of the quantile estimation algorithms developed for 
use in practice are not single-pass algorithms and thus cannot 
be applied to streaming data m. On the other hand, many 
single-pass approaches focus on the exact computation of the 
quantile and therefore, as explained previously, demand 
$O(n)$ storage space, which is clearly infeasible 
in the context we consider in the present paper; this includes 
the work by Greenwald and Khanna Eol who described an 
$O(n)$ method efficient in the sense that it attains the asymptotic 
minimum in the space requirement as a function of the permissible 
error in the desired quantile estimate. Amongst the few 
methods described in the literature which satisfy the practical 
constraints of interest in the present paper are the histogram-based 
method of Schmeiser and Deutsch [25] (with a similar 
approach described by McDermott et al. mi), and the 
algorithm of Jain and Chlamtac [13]. Schmeiser and Deutsch 
maintain a preset number of bins, scaling their boundaries 
to cover the entire data range as needed and keeping them 
equidistant. Jain and Chlamtac attempt to maintain a small set 
of ad hoc selected key points of the data distribution, updating 
their values using quadratic interpolation as new data arrives. 
Various random sample methods, such as those described by 
Vitter [27] and by Cormode and Muthukrishnan [9], use different 
sampling strategies to fill the available buffer with random 
data points from the stream, and estimate the quantile using 
the distribution of values in the buffer. Lastly, the recently 
proposed algorithm of Arandjelovic et al. [3] employs an 
adaptable quasi-maximum entropy histogram; their approach 
is discussed further in Section II-B. 
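
To make the random sample family concrete, the following is a minimal sketch of a reservoir-based estimator. It assumes the classic uniform reservoir scheme (often called Algorithm R) and an inverse empirical distribution read-out from the buffer; the function name `reservoir_quantile`, the `seed` parameter and the read-out convention are illustrative choices and are not taken from the cited works.

```python
import math
import random


def reservoir_quantile(stream, q, capacity, seed=0):
    """Estimate the q-quantile from a uniform random sample of the stream.

    Assumption: classic reservoir sampling (Algorithm R); the sampling
    strategies of the methods cited in the text may differ.
    """
    rng = random.Random(seed)
    reservoir = []
    for n, x in enumerate(stream, start=1):
        if len(reservoir) < capacity:
            # The first `capacity` data points fill the buffer directly.
            reservoir.append(x)
        else:
            # Replace a random slot with probability capacity / n.
            j = rng.randrange(n)
            if j < capacity:
                reservoir[j] = x
    if not reservoir:
        raise ValueError("stream must contain at least one datum")
    ordered = sorted(reservoir)
    # Assumed read-out: smallest buffered value with at least a
    # fraction q of the buffered values at or below it.
    return ordered[math.ceil(q * len(ordered)) - 1]
```

Because every retained point is drawn uniformly from the whole history, a sample of this kind spends most of the buffer far from a high target quantile, which is the inefficiency the proposed method is designed to avoid.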

In addition to the ad hoc elements of the existing algorithms 
for quantile estimation on streaming data, which itself is a 
sufficient cause for concern when the algorithms need to be 
deployed in applications which demand high robustness and 
well understood failure modes, it is also important to recognize 
that an implicit assumption underlying these approaches (with 
the exception of the algorithm of Arandjelovic et al. [3]; see 
Section II-B) is that the data is governed by a stationary 
stochastic process. The assumption is often invalidated in real- 
world applications. 

II. Proposed algorithm 

We begin with a formalization of the notion of a quantile, 
follow with a derivation of the key results underlying our 
contribution, and finally describe the proposed algorithm. 


A. Quantiles 


Let p be the probability density function of a real-valued 
random variable X. Then the q-quantile $v_q$ of p is defined 
as lITSl : 

$$\int_{-\infty}^{v_q} p(x)\,dx = q. \tag{1}$$

Similarly, the q-quantile of a finite set D can be defined as: 

$$|\{x : x \in D \text{ and } x < v_q\}| \le q \times |D|. \tag{2}$$

In other words, the q-quantile is the smallest value below 
which a fraction q of the total values in a set lie. The concept 
of a quantile is thus intimately related to the tail behaviour of 
a distribution. 
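
As an illustration of the finite-set definition, the sketch below computes an exact quantile by sorting, and an exact running quantile by keeping every datum in a sorted list. The indexing convention (the smallest value with at least a fraction q of the data at or below it) is one common choice and is an assumption here, as are the function names. The running version needs memory proportional to the length of the stream, which is exactly what a memory-limited estimator cannot afford.

```python
import bisect
import math


def exact_quantile(values, q):
    """Exact q-quantile of a finite collection (assumed convention:
    smallest value with at least a fraction q of the data at or below it)."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    return ordered[math.ceil(q * len(ordered)) - 1]


def running_exact_quantile(stream, q):
    """Yield the exact q-quantile after each datum.

    Every datum is stored, so memory grows linearly with the stream.
    """
    ordered = []
    for x in stream:
        bisect.insort(ordered, x)  # keep the history sorted
        yield ordered[math.ceil(q * len(ordered)) - 1]
```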


B. Challenges of non-stationarity 

In this work our aim is to develop a method for quantile 
estimation applicable not only to streams which exhibit 
stationary stochasticity but also to the all-encompassing set 
of streams which includes those with non-stationary data. It 
is a straightforward consequence of potential non-stationarity 
that at no point in time can it be assumed that the historical 
distribution of data values is representative of the future 
distribution of the stream data. This is true regardless of how much 
historical data has been seen. Thus, the value of a particular 
quantile can change greatly and rapidly, in either direction 
(i.e. increase or decrease). This is illustrated in Figure 1 using an example 
extracted from a real-world data set used for surveillance 
video analysis (the full data corpus is used for comprehensive 
evaluation of different methods in Section III). 
In particular, the top plot in this figure shows the variation 
of the ground-truth 0.95-quantile which corresponds to the 
data stream shown in the bottom plot. Notice that the quantile 
exhibits little variation over the course of approximately the 
first 75% of the duration of the time window (the first 190,000 
data points). This corresponds to a period of little activity in 
the video from which the data is extracted (see Section III-A 
for a detailed explanation). Then, the value of the quantile 
increases rapidly by over an order of magnitude; this is 
caused by a sudden burst of activity in the surveillance video 
and the corresponding change in the statistical behaviour of 
the data. 



Fig. 1. An example of a rapid change in the value of a quantile (specifically 
the 0.95-quantile in this case) on a real-world data stream used for surveillance 
video analysis (see Section III-A). 


It may appear to be the case that to be able to adapt to such 
unpredictable variability in input it is necessary to maintain 
an approximation of the entire distribution of historical data. 
Indeed, this is argued in a recent work which introduced 
the Data-Aligned Maximum Entropy Histogram algorithm for 
quantile estimation from streams [3]. The method employs 
a histogram of a fixed length, determined by the available 
working memory, which adjusts bin boundary values in a 
manner which maximizes the entropy of the corresponding 
estimate of the historical data distribution. 

Although it is true that the change in the value of a specific 
quantile may be of an arbitrarily large magnitude, in this paper 
we show that its specific value in a particular stream is 
nevertheless constrained. Succinctly put, this is a consequence 
of the fact that although the stream data may be considered as 
being drawn from a continuous probability density function 
(which may change with time) the information available to 
our algorithm inherently comprises discrete quanta: individual 
data points. 















C. Constraints: key theoretical results 

Consider a stream of values $x_1, x_2, \ldots, x_n$. For the time 
being let us assume that there are no repeated values in the 
stream, i.e. $\forall i, j.\; x_i = x_j \Rightarrow i = j$. Then there is an indexing 
function $f_n(\cdot)$ such that $x_{f_n(1)} < x_{f_n(2)} < \cdots < x_{f_n(n)}$. Let 
$x_q(n) = x_{f_n(k)}$ be the current estimate of a particular quantile 
q of interest. Consider $\Delta k$, the change in k that the arrival of 
a new datum $x_{n+1}$ effects. By the definition given in Equation 2: 

$$\Delta k = \lfloor (1-q) \times (n+1) \rfloor - \lfloor (1-q) \times n \rfloor. \tag{3}$$

Exploiting simple properties of the floor function then leads 
to the following series of inequalities and an upper bound on 
the value of $\Delta k$: 

$$\begin{aligned}
\Delta k &= \lfloor (1-q) \times (n+1) \rfloor - \lfloor (1-q) \times n \rfloor && (4)\\
&\le (1-q) \times (n+1) - \lfloor (1-q) \times n \rfloor && (5)\\
&= (1-q) \times n - \lfloor (1-q) \times n \rfloor + (1-q) && (6)\\
&< 1 + (1-q) = 2 - q < 2. && (7)
\end{aligned}$$

Since $\Delta k$ has to be an integer: 

$$\Delta k \le 1. \tag{8}$$

A similar sequence of steps can also give us the lower bound 
on $\Delta k$: 

$$\begin{aligned}
\Delta k &= \lfloor (1-q) \times (n+1) \rfloor - \lfloor (1-q) \times n \rfloor && (9)\\
&\ge \lfloor (1-q) \times (n+1) \rfloor - (1-q) \times n && (10)\\
&= \lfloor (1-q) \times (n+1) \rfloor - (1-q) \times (n+1) + (1-q) && (11)\\
&> -1 + (1-q) = -q, && (12)
\end{aligned}$$

and since $\Delta k$ has to be an integer and $q < 1$: 

$$\Delta k > -q > -1 \;\Rightarrow\; \Delta k \ge 0. \tag{13}$$

Finally, combining the two results gives: 

$$0 \le \Delta k \le 1. \tag{14}$$

Thus, rather remarkably at first sight, regardless of the value 
of the new datum $x_{n+1}$, the index in the sorted 
stream that references the correct quantile value either 
remains unchanged or increases by one. This shows that while 
the observation made in Section II-B holds, namely that the value of the 
quantile estimate may exhibit an arbitrarily large change, the estimate is 
nonetheless constrained to the specific values of the stream 
just below or just above the previous (current) estimate, 
a consequence of the inherently quantized nature of the data 
which comprise the stream (discrete data points). This insight 
motivates us to propose the following three key ideas for an 
effective and efficient algorithm: 

• the buffer should store a list of monotonically increasing 
stream values 

• the current quantile should be as close to the centre of 
the buffer as possible 

• the spread of buffer values should decrease when the 
estimate is unchanging 


Lastly, as we will show in the next section, the assumption 
of non-repeating data can be removed by employing a 
representation which does not store repeated observations but 
nevertheless keeps track of repetition using an auxiliary data 
structure. 
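
The bound on the index change derived in this section can be checked numerically. The short sketch below evaluates the expression for the change in k from Equation 3 over a range of stream lengths and target quantiles, using exact rational arithmetic so that the floor function is not affected by floating-point rounding; the helper name `index_change` and the particular quantiles tested are illustrative choices.

```python
from fractions import Fraction
from math import floor


def index_change(q, n):
    """Change in k when datum n + 1 arrives:
    floor((1 - q)(n + 1)) - floor((1 - q) n), as in Equation 3."""
    return floor((1 - q) * (n + 1)) - floor((1 - q) * n)


quantiles = [Fraction(1, 2), Fraction(95, 100), Fraction(999, 1000)]
changes = {index_change(q, n) for q in quantiles for n in range(1, 5000)}
print(changes)  # only the values 0 and 1 occur
```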

D. Targeted Adaptable Sample algorithm 

Having laid out the key theoretical results underpinning our 
approach we are now in a position to introduce our quantile 
estimation algorithm. At the heart of the proposed method 
is a data structure which comprises two parts. The first of 
these is an ordered list of data points $b_1 < b_2 < \cdots < b_m$, 
where m is the buffer capacity (size), selected from the input 
data stream. The second part of the structure is auxiliary 
information, a sequence $a_1, a_2, \ldots, a_m$ of values, associated 
with the selected data. Specifically, for each remembered 
datum $b_i$ we also maintain an estimate of the number of 
historical data points whose value is lower than that of the 
datum, i.e. after the processing of n data points, $a_i$ is the 
estimate of $|\{x_j \mid x_j < b_i,\ j = 1, \ldots, n\}|$. 

The first m unique data points are simply stored in the 
buffer in increasing order; the associated auxiliary counts 
$a_1, \ldots, a_m$ can be computed exactly. With the arrival of each 
new datum $x_{n+1}$ thereafter, the following sequence of steps 
takes place. Firstly, if the value of the new datum is already 
present in the buffer, the auxiliary counts corresponding to 
greater buffer values are incremented by one. Otherwise, the 
index k into the buffer of the current quantile is determined 
by finding the lowest element $b_k$ in the buffer such that 
$a_k/n > 1 - q$, where q is the target quantile. Then if the 
new datum is smaller than the current quantile estimate, i.e. 
$x_{n+1} < b_k$, and either $k < \lfloor m/2 \rfloor$ or $x_{n+1} > b_1$, the new datum 
$x_{n+1}$ is inserted into the buffer and the largest value in the 
buffer, $b_m$, discarded. The former case reinforces the central 
positioning of the current quantile estimate, while the latter 
acts so as to decrease the spread of values within the buffer. 
The auxiliary count corresponding to the newly inserted datum 
is initialized by linearly interpolating between the counts of 
buffer values between which the datum is inserted. Auxiliary 
counts corresponding to lower valued buffer elements are 
left unchanged while those corresponding to higher valued 
elements are increased by one. Similarly, if the new datum 
is greater than the current quantile estimate, i.e. $x_{n+1} > b_k$, 
and either $k > \lfloor m/2 \rfloor$ or $x_{n+1} < b_m$, the new datum $x_{n+1}$ 
is inserted into the buffer and the smallest value in the buffer, 
$b_1$, discarded. 
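
The update rule described above can be sketched in code as follows. This is a hedged illustration rather than a reference implementation, and several details the text leaves open are filled in by assumption: the threshold on $a_k/n$ is passed in as the parameter `fraction_below` (the text writes it as $1 - q$); if no buffer element exceeds the threshold the last element is used; a datum inserted outside the current buffer range takes its count by constant extrapolation from the nearest element; and counts of buffer values above a datum are incremented even when the datum itself is not retained. The class and method names are hypothetical.

```python
import bisect


class TargetedAdaptableSample:
    """Sketch of the two-part buffer: values b and auxiliary counts a."""

    def __init__(self, fraction_below, capacity):
        self.p = fraction_below  # threshold on a_k / n (assumed parameter)
        self.m = capacity        # buffer capacity m
        self.b = []              # retained values, strictly increasing
        self.a = []              # a[i]: estimated count of data below b[i]
        self.n = 0               # number of data points processed

    def _quantile_index(self):
        # Lowest k with a_k / n above the threshold.
        for k, count in enumerate(self.a):
            if count > self.p * self.n:
                return k
        return len(self.b) - 1   # assumption: fall back to the top element

    def estimate(self):
        if not self.b:
            raise ValueError("no data processed yet")
        return self.b[self._quantile_index()]

    def _interpolated_count(self, x, pos):
        if pos == 0:
            return self.a[0]       # assumption: constant extrapolation
        if pos == len(self.b):
            return self.a[-1] + 1  # assumption: constant extrapolation
        lo, hi = self.b[pos - 1], self.b[pos]
        t = (x - lo) / (hi - lo)
        return self.a[pos - 1] + t * (self.a[pos] - self.a[pos - 1])

    def update(self, x):
        pos = bisect.bisect_left(self.b, x)
        insert, drop, count = False, None, None
        first_greater = pos
        if pos < len(self.b) and self.b[pos] == x:
            first_greater = pos + 1          # repeated value: counts only
        elif len(self.b) < self.m:
            # Fill phase: nothing has been discarded, so counts are exact.
            count = self.a[pos] if pos < len(self.b) else self.n
            insert = True
        else:
            k = self._quantile_index()
            count = self._interpolated_count(x, pos)
            if x < self.b[k] and (k < self.m // 2 or x > self.b[0]):
                insert, drop = True, "largest"
            elif x > self.b[k] and (k > self.m // 2 or x < self.b[-1]):
                insert, drop = True, "smallest"
        # Every buffer value greater than x has one more datum below it.
        for i in range(first_greater, len(self.b)):
            self.a[i] += 1
        if insert:
            self.b.insert(pos, x)
            self.a.insert(pos, count)
            if drop == "largest":
                self.b.pop()
                self.a.pop()
            elif drop == "smallest":
                self.b.pop(0)
                self.a.pop(0)
        self.n += 1
```

Each update costs time linear in the buffer capacity because of the count increments and the list insertion, and the memory footprint is fixed at two arrays of length m regardless of the length of the stream.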

III. Evaluation 

We now turn our attention to the evaluation of the proposed 
algorithm. In particular, to assess its effectiveness and compare 
it with the algorithms described in the literature, in this section 
we report its performance on three large ‘real-world’ data 
streams. 

A. Real-world surveillance data 

Computer-assisted video surveillance data analysis is of 
major commercial and law enforcement interest. On a broad 
scale, systems currently available on the market can be 
grouped into two categories in terms of their approach. The 
first group focuses on a relatively small, predefined and well 
understood subset of events or behaviours of interest such 
as the detection of unattended baggage, violent behaviour, 
etc El, M- The narrow focus of these systems prohibits 
their applicability in less constrained environments in which a 
more general capability is required. These approaches tend to 
be computationally expensive and error prone, often requiring 
fine tuning by skilled technicians. This is not practical in many 
circumstances, for example when hundreds of cameras need 
to be deployed, as is often the case with CCTV systems operated 
by municipal authorities. The second group of systems 
approaches the problem of detecting suspicious events at a 
semantically lower level ifTSl . Il22l . lITSl . ||2, |H. Their central 
paradigm is that an unusual behaviour at a high semantic 
level will be associated with statistically unusual patterns (also 
‘behaviour’ in a sense) at a low semantic level - the level 
of elementary image/video features. Thus methods of this 
group detect events of interest by learning the scope of normal 
variability of low-level patterns and alerting to anything that 
does not conform to this model of what is expected in a 
scene, without ‘understanding’ or interpreting the nature of 
the event itself. These methods uniformly start with the same 
procedure for feature extraction. As video data is acquired, 
firstly a dense optical flow field is computed using the well- 
known method of Lucas and Kanade ini. Then, to reduce 
the amount of data that needs to be processed, stored, or 
transmitted, a thresholding operation is performed. This results 
in a sparse optical flow field whereby only those flow vectors 
whose magnitude exceeds a certain value are retained; non-maximum 
suppression is applied here as well j^ . Normal 
variability within a scene and subsequent novelty detection are 
achieved using various statistics computed over this data. The 
data streams, shown partially in Figure 2, correspond to the 
values of such statistics (their exact meaning is proprietary 
and has not been made known fully to the authors of the 
present paper either). Observe the non-stationary nature of the 
streams which is evident both on the long and short time scales 
(magnifications are shown for additional clarity and insight). 
Table I provides a summary of some of the key features of 
the three data sets acquired in the described manner and used 
for the evaluation in this paper. 
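
The thresholding step of the feature extraction pipeline can be illustrated with a short sketch. It assumes the dense flow field is available as a grid of (u, v) displacement pairs and that the threshold is applied to the Euclidean magnitude of each vector; the function name `sparsify_flow`, the data layout and the parameter `min_magnitude` are illustrative assumptions, and the optical flow computation and non-maximum suppression steps are not shown.

```python
import math


def sparsify_flow(flow, min_magnitude):
    """Keep only the flow vectors whose magnitude exceeds min_magnitude.

    `flow` is assumed to be a list of rows, each a list of (u, v) pairs.
    Returns a dictionary mapping (row, column) to the retained vector.
    """
    sparse = {}
    for row_index, row in enumerate(flow):
        for col_index, (u, v) in enumerate(row):
            if math.hypot(u, v) > min_magnitude:
                sparse[(row_index, col_index)] = (u, v)
    return sparse
```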


TABLE I 

Key statistics of the three real-world data sets used in our evaluation. 

Data set    Data points    Mean value      Standard deviation
Stream 1    555,022        7.81 × 10^10    1.65 × 10^11
Stream 2    10,424,756     2.25            15.92
Stream 3    1,489,618      1.51 × 10®      2.66 × 10®


B. Results 


We compared the performances of our algorithm and the 
four alternatives from the literature described in Section I-A: 
(i) the P² algorithm of Jain and Chlamtac [13], (ii) the 
random sample based algorithm of Vitter [27], (iii) the uniform 
adjustable histogram of Schmeiser and Deutsch [25], and (iv) 
the data-aligned maximal entropy histogram of Arandjelovic 
et al. 0, 0. A representative summary of results is shown 
in Table II. It can be readily observed that our method and 
the method of Arandjelovic et al. significantly outperformed 
other approaches. The P² and equispaced histogram based 
algorithms performed worst, often producing highly inaccurate 
estimates. The random sample algorithm of Vitter performed 
relatively well but still substantially worse than the top two 
methods. It is interesting to note that the data-aligned maximal 
entropy histogram of Arandjelovic et al. outperformed the 
proposed method. At first we found this highly surprising 
given that this algorithm approximates the entire distribution 
of historical data whereas ours, by design, narrows its focus 
to the more relevant part of the distribution. We hypothesized 
that the reason behind this is that the quantile we sought 
to estimate was insufficiently challenging (not close enough 
to 1, relative to buffer size). Specifically, our hypothesis 
stems from the observation that some information is lost by 
interpolation every time a new datum is added to our buffer. 
While interpolation is also employed by Arandjelovic et al., 
when the target quantile is not particularly challenging relative 
to the buffer size, the number of interpolations performed by 
the simple data-aligned maximal entropy histogram is lower 
and its underlying model sufficiently flexible to produce an 
accurate estimate. Consequently, we hypothesized that the 
advantages of our method would only be fully exhibited for 
higher quantiles (needed in applications such as customer 
wallet estimation EB) and we sought to investigate that next. 

In the second set of experiments we compared our method 
with the data-aligned maximal entropy histogram of Arandjelovic 
et al. using a series of progressively challenging target 
quantiles. A summary of the results is shown in Table III. 
It is readily apparent that this set of results fully supports 
our hypothesis. While our algorithm showed an improvement 
in performance as the value of the target quantile was increased, 
the opposite was true for the data-aligned maximal 
entropy histogram which performed progressively worse. Data 
set 3 again proved to be the most challenging one, the 
data-aligned maximal entropy histogram producing grossly 
inaccurate estimates for quantile values of over 0.99. For 
example, on stream 3 for the target quantile of 0.999 the 
data-aligned maximal entropy histogram achieved the average 
relative L1 error of 368.6%, while the proposed algorithm 
showed remarkable accuracy and the error of 1.6%. The same 
observations can be made by considering the absolute L∞ 
error, i.e. the greatest error in the running quantile estimates, 
which were respectively 2.35e7 and 3.35e6, a difference of 
approximately an order of magnitude. 
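
For concreteness, the two error measures can be computed from a sequence of running estimates and the corresponding ground-truth values as sketched below. The absolute L∞ error follows the description in the text (the greatest error over the run). The relative L1 error is taken here to be the mean of the per-datum absolute errors divided by the ground truth, which is an assumption since the exact normalization is not spelled out; the function names are illustrative.

```python
def relative_l1_error(estimates, truths):
    """Mean of |estimate - truth| / |truth| over the run, as a fraction.

    Assumption: per-datum normalization by the running ground truth.
    """
    ratios = [abs(e - t) / abs(t) for e, t in zip(estimates, truths)]
    return sum(ratios) / len(ratios)


def absolute_linf_error(estimates, truths):
    """Greatest absolute error in the running quantile estimates."""
    return max(abs(e - t) for e, t in zip(estimates, truths))
```

Multiplying the first measure by 100 gives a percentage comparable in form to the relative errors reported in the tables.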


Table III also includes a column (right-most) showing the 








TABLE II 

Comparative experimental results for 0.95-quantile. 

                                                  Stream 1               Stream 2               Stream 3
Method                                    Bins    Relative   Absolute    Relative   Absolute    Relative   Absolute
                                                  L1 error   L∞ error    L1 error   L∞ error    L1 error   L∞ error
Targeted adaptable sample (proposed)      500     2.1%       1.00e11     4.7%       24.20       5.2%       4.8e5
                                          100     1.6%       1.07e11     9.2%       54.73       3.6%       2.89e5
Data-aligned max. entropy histogram [3]   500     1.2%       5.11e10     0.0%       2.04        0.1%       8.11e4
                                          100     9.6%       2.06e11     0.0%       1.91        2.6%       3.33e5
P² algorithm [13]                         n/a     15.7%      2.77e11     3.1%       93.04       84.2%      1.55e6
Random sample [27]                        500     4.6%       1.98e11     0.7%       38.00       10.4%      5.95e5
Equispaced histogram [25]                 500     87.1%      1.07e12     0.1%       80.29       675.1%     4.39e7



TABLE III 

Comparison of the top two algorithms for high-value quantiles using 100 bins. 

                        Proposed method         Data-aligned histogram    Max value
Data set    Quantile    Relative   Absolute     Relative   Absolute       to quantile
                        L1 error   L∞ error     L1 error   L∞ error       ratio
Stream 1    0.9500      1.6%       1.07e11      9.6%       2.06e11        15.8
            0.9900      1.2%       9.59e10      27.9%      5.69e11        5.9
            0.9950      2.1%       9.27e10      58.8%      8.48e11        4.2
            0.9990      0.7%       9.80e10      48.0%      9.47e11        2.1
            0.9995      0.3%       2.69e10      36.8%      8.72e11        1.5
Stream 2    0.9500      9.2%       54.73        0.0%       1.91           30.1
            0.9900      2.4%       26.31        0.3%       2.45           2.5
            0.9950      0.3%       6.21         0.2%       4.59           1.8
            0.9990      0.2%       16.05        0.4%       30.29          1.4
            0.9995      0.2%       20.17        2.0%       34.44          1.3
Stream 3    0.9500      3.6%       2.89e5       2.6%       3.33e5         520.3
            0.9900      1.2%       3.32e6       2.4%       3.25e5         122.7
            0.9950      1.8%       1.40e6       480.5%     1.63e8         60.9
            0.9990      1.6%       3.35e6       368.6%     2.35e7         11.7
            0.9995      4.2%       1.30e7       364.2%     2.34e8         7.2

Fig. 2. The three large ‘real-world’ data streams used in our evaluation; panel (a) shows data stream 1. 



Fig. 3. An example running ground truth of the target quantile (q = 0.95) 
and the estimates of our algorithm for different bin sizes on data stream 3. It is 
remarkable to observe that our method achieved a consistently highly accurate 
estimate even when the available buffer capacity was severely restricted (down 
to only 12 bins). 


ratio of the maximal stream value and the ground truth for 
the target quantile. We sought to examine if a particularly 
high ratio predicts poor performance of the data-aligned maximal 
entropy histogram, which may be expected given that 
throughout its operation the algorithm approximates the entire 
distribution of historical data. We found this not to be the 
case, which can be explained by the allocation of bin ranges 
according to the maximum entropy principle and the alignment 
of the bin boundaries with data; please see the original 
publication for a detailed description of the method [3]. 

Fig. 4. Our algorithm is highly successful in achieving one of the key ideas 
behind the method, that of adapting the data sample retained in the buffer 
so as to maintain the position of the current quantile estimate in the buffer 
as close to its centre as possible (see Section II-C). Both in the case of a 
buffer with the capacity of 100 and 12 (the results for only two buffer sizes 
are shown to reduce clutter), the central positioning of the quantile estimate 
is maintained very tightly throughout the processing of the stream. 

Lastly, we sought to analyse the performance of the proposed 
method in additional detail. Figure 3 shows on an 
example the running ground truth of the target quantile (q = 
0.95) and the estimates of our algorithm for different bin 
sizes on the most challenging data stream 3. It is remarkable 
to observe that our method consistently achieved a highly 
accurate estimate even when the available buffer capacity 
was severely restricted (to 12 bins). In Figure 4 the same 
example run was used to illustrate the success of our algorithm 
in achieving one of the key ideas behind the method, that 
of adapting the data sample retained in the buffer so as to 
maintain the position of the current quantile estimate in the 
buffer as close to its centre as possible (see Section II-C). 
As the plot clearly shows, both in the case of a buffer with 
the capacity of 100 and 12 (the results for only two buffer 
sizes are shown to reduce clutter), the central positioning of 
the quantile estimate is maintained very tightly throughout 
the processing of the stream. Similarly, the success of our 






































































[Plot: 0.95-quantile ground truth, cumulative data maximum, and buffer spread (at buffer size 12), against datum index.] 




Fig. 5. The success of our algorithm in achieving tight sampling of the 
data distribution around the target quantile - unlike the random sample based 
algorithm of Vitter [27] or the uniform adjustable histogram of Schmeiser and 
Deutsch [25], which retain a sample from a wide range of values, our method 
utilizes the available memory efficiently by focusing on a narrow spread of 
values around the current quantile estimate. While the spread of values in the 
buffer experiences intermittent and transient increases when there is a burst of 
high valued data points in preparation for a potentially large quantile change, 
thereafter it quickly adapts to the correct part of the distribution. 


Fig. 7. The variation in the accuracy of our algorithm’s estimate with 
the buffer size on data stream 3. Unlike any of the existing algorithms, 
our method exhibits very gradual and graceful degradation in performance, 
and still achieves remarkable accuracy even with a severely restricted buffer 
capacity. 


algorithm in achieving tight sampling of the data distribution 
around the target quantile is illustrated in the plot in Figure 5. 
This plot shows that unlike the random sample based algorithm 
of Vitter [27] or the uniform adjustable histogram of Schmeiser 
and Deutsch [25], which retain a sample from a wide range of 
values, our method utilizes the available memory efficiently 
by focusing on a narrow spread of values around the current 
quantile estimate. Note that the spread of values in the buffer 
experiences intermittent and transient increases when there 
is a burst of high valued data points in preparation for a 
potentially large quantile change, but thereafter quickly adapts 
to the correct part of the distribution. The variation in the 
mean buffer spread with the buffer size and target quantile 
is shown in Figure 6. Lastly, the variation in the accuracy of 
our algorithm’s estimate with the buffer size is analysed in 
Figure 7. Unlike any of the existing algorithms, our method 
exhibits very gradual and graceful degradation in performance, 
and still achieves remarkable accuracy even with a severely 
restricted buffer capacity. 



Fig. 6. The variation in the mean buffer spread with the buffer size and 
target quantile on data stream 3. 


IV. Summary and conclusions 

In this paper we described a novel algorithm for the estimation 
of a quantile of a data stream when the available working 
memory is limited (constant), prohibiting the storage of all 
historical data. This problem is ubiquitous in computer vision 
and signal processing, and has been addressed by a number of 
researchers in the past. We showed that a major shortcoming of 
the existing methods lies in their usually implicit assumption 
that the data is being generated by a stationary process. This 
assumption is invalidated in most practical applications, as we 
illustrated using real-world data. 

Evaluated on three large data streams extracted from CCTV 
footage, our algorithm was vastly superior in comparison 
with the existing alternatives. The highly non-stationary nature 
of the data was shown to cause major problems to previous 
methods, often leading to grossly inaccurate quantile 
estimates; in contrast, our method was virtually unaffected 
by it. What is more, our experiments demonstrate that the 
superior performance of our algorithm can be maintained 
effectively while drastically reducing the working memory size 
in comparison with the methods from the literature. 